Microbial expression of a chimeric gene is used to
produce a polypeptide comprising the amino acid sequence
of human proinsulin, or an analog thereof differing in the "C"
chain portion. A polypeptide so produced contains a
sequence of additional amino acid units sufficient in number to
protect it from bacterial proteases, and has a cleavage site,
e.g. a methionine residue, adjacent the sequence of amino
acid units corresponding to the proinsulin or proinsulin
analog. Cleavage at this site (e.g. by CNBr) generates
proinsulin (or the analog), which is treated in vitro to form the
disulfide bonds between the "A" and "B" chain proteins
characteristic of human insulin. The "C" chain portion is then
excised enzymatically to yield human insulin useful, e.g., in
the treatment of diabetes.

The chimeric gene may be synthesised from
oligonucleotides and inserted into a plasmid which is used to
transform a host cell, e.g. E. coli.

HUMAN PROINSULIN AND ANALOGS THEREOF
AND METHOD OF PREPARATION BY MICROBIAL
POLYPEPTIDE EXPRESSION AND
CONVERSION THEREOF TO HUMAN INSULIN

Related Applications

This application is related to and incorporates by
reference the disclosures of European Patent Application
Publications Nos. 0001930 (A.N. 78300597.8) and 0036776
(A.N. 81301227.5).

Field of the Invention 

This invention relates to microbial expression of 
polypeptides. In one aspect, it relates to the preparation of 
genes for the microbially expressible production of 
intermediates useful in the preparation of human insulin. In 
another aspect, it relates to the preparation of human 
proinsulin or analogs thereof differing from human proinsulin
in the "C" chain portion. In yet another aspect, it relates to
the preparation of human insulin from the prepared human 
proinsulin or an analog thereof. 



Background of the Invention

Diabetes, the human condition characterized by a failure of
the pancreas to generate the polypeptide hormone insulin in
sufficient quantities, in severe cases at least, is currently
treated by injection of insulin derived from the pancreas of
slaughtered animals. Bovine and porcine insulin, in
particular, are used for this purpose.

The use of insulin derived from animals is unsatisfactory
from at least two standpoints. In the first place, the
extraction of insulin from the pancreas of slaughtered animals
is a complex process that requires large quantities of the
organs. Secondly, and more importantly from the diabetic's
point of view, the insulin derived from animal sources is not
chemically identical to human insulin, differing in the
sequence of peptide units. Furthermore, it sometimes contains
non-homologous animal hormones, such as the corresponding
proinsulin, albeit in small quantities. As a result, the
response of patients treated with animal derived insulin is not 
as satisfactory as desired. For example, an immune response to 
animal insulin is believed to be a source of chronic 
complications in certain treatments of diabetes. 

Accordingly, there has gone unfilled a long felt need to
have a source of insulin identical chemically to human insulin,
uncontaminated by other biologically active impurities, in
amounts sufficient to permit diabetics to be treated
economically. Complicating this task is the complex chemical
structure of human insulin. Structurally it has two
polypeptide chains referred to as the "A" and "B" chains bound
to each other by disulfide bonds. The A chain, some 21 amino
acid units in length, is bound (crosslinked) to the B chain, a
chain of 30 amino acids, through disulfide bonds between units
of the amino acid cysteine in each chain.

To be maximally effective in humans, the amino acid units
of insulin must be precisely ordered to correspond to that
produced in vivo. However, the complexity of the molecule is
such that conventional methods of chemical synthesis are
unsuited to its preparation, on a commercial scale at least.

Insulin is produced in vivo in the pancreas in the form of
preproinsulin. Preproinsulin is a polypeptide comprising the
21 units of the A chain, the 30 units of the B chain, a
bridging or connecting chain of 35 units referred to as the C
chain and a 24 amino acid "presequence" (Met Ala Leu Trp Met
Arg Leu Leu Pro Leu Leu Ala Leu Leu Ala Leu Trp Gly Pro Asp Pro
Ala Ala Ala) attached to the N-terminal phenylalanine amino
acid beginning the B chain. Proinsulin, lacking the
presequence, is shown in Figure 1 with a methionine amino acid
in place of the presequence. This presequence may participate
in secretion from the cells in which it is produced. As the
preproinsulin is excreted from the islet cells of the pancreas,
the presequence is excised to leave the proinsulin chain. This
chain folds to a structure in which three disulfide bonds are
formed, two of which are between the A and B chain segments of
the proinsulin. The connecting C chain is then excised
proteolytically to leave a residue which is insulin, consisting
of the A and B chains bound together by the disulfide bonds.
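The chain lengths given above can be tallied in a few lines. The following is only an illustrative sketch: the names are hypothetical, and the totals of 110, 86 and 51 units are simply sums of the figures quoted in the preceding paragraph.

```python
# Chain lengths as stated in the text (amino acid units).
PRESEQUENCE = 24
B_CHAIN = 30
C_CHAIN = 35
A_CHAIN = 21

# The presequence quoted in the text, written out so its length can be checked.
PRESEQUENCE_RESIDUES = (
    "Met Ala Leu Trp Met Arg Leu Leu Pro Leu Leu Ala "
    "Leu Leu Ala Leu Trp Gly Pro Asp Pro Ala Ala Ala"
).split()


def chain_lengths():
    """Return unit counts for preproinsulin, proinsulin and insulin."""
    proinsulin = B_CHAIN + C_CHAIN + A_CHAIN
    return {
        "preproinsulin": PRESEQUENCE + proinsulin,
        "proinsulin": proinsulin,
        "insulin": A_CHAIN + B_CHAIN,
    }


if __name__ == "__main__":
    for name, units in chain_lengths().items():
        print(name, units)
```

The proinsulin total of 86 units agrees with the numbering used later in the description, where the cDNA segment is said to code for amino acids 32-86.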

This application describes a method for obtaining human
insulin and human proinsulin and analogs thereof which differ
from human proinsulin in the sequence of amino acids making up
the C chain. The method utilizes the burgeoning recombinant
DNA technology. The following discussion of elements of the
technology provides background to the detailed description of
the invention.

With the advent of recombinant DNA technology, the
controlled bacterial production of useful polypeptides has
become possible. Already in hand are bacteria modified by this
technology to permit the production of such polypeptide
products as somatostatin (K. Itakura et al., Science 198, 1056
(1977)), the (component) A and B chains of human insulin (D.V.
Goeddel et al., Proc. Nat'l. Acad. Sci. USA 76, 106 (1979)) and
human growth hormone (D.V. Goeddel et al., Nature 281, 544
(1979)). Such is the power of the technology that virtually
any useful polypeptide may be bacterially produced, putting
within reach the controlled manufacture of hormones, enzymes,
antibodies, and vaccines useful against a wide variety of
diseases. The cited materials, which describe in greater
detail the representative examples referred to above, are
incorporated herein by reference, as are other publications
referred to infra, to illuminate the background of the
invention.

The work horse of recombinant DNA technology is the
plasmid, an extra-chromosomal loop of double-stranded DNA found
in bacteria, oftentimes in multiple copies per bacterial cell.
Included in the information encoded in the plasmid DNA is that
required to reproduce the plasmid in daughter cells (i.e., a
"replicon") and ordinarily, one or more selection
characteristics, such as resistance to antibiotics, which
permit clones of the host cell containing the plasmid of
interest to be recognized and preferentially grown in selective
media. The utility of plasmids, which can be recovered and
isolated from the host microorganism, lies in the fact that
they can be specifically cleaved by one or another restriction
endonuclease or "restriction enzyme", each of which recognizes
a different site on the plasmidic DNA. Thereafter heterologous
genes or gene fragments may be inserted into the plasmid by
endwise joining at the cleavage site or at reconstructed ends
adjacent the cleavage site.

As used herein, the term "heterologous" refers to a gene
not ordinarily found in, or a polypeptide sequence ordinarily
not produced by, the host microorganism whereas the term
"homologous" refers to a gene or polypeptide which is produced
in the host microorganism, such as E. coli. DNA recombination
is performed outside the microorganisms but the resulting
"recombinant" plasmid can be introduced into microorganisms by
a process known as transformation and large quantities of the
heterologous gene-containing recombinant plasmid obtained by
growing the transformant. Moreover, where the gene is properly
inserted with reference to portions of the plasmid which govern
the transcription and translation of the encoded DNA message,
the resulting plasmid or "expression vehicle", when
incorporated into the host microorganism, directs the
production of the polypeptide sequence for which the inserted
gene codes, a process referred to as expression.

Expression is initiated in a region known as the promoter
which is recognized by, and bound by, RNA polymerase. In some
cases, as in the trp operon discussed infra, promoter
regions are overlapped by "operator" regions to form a combined
promoter-operator. Operators are DNA sequences which are
recognized by so-called repressor proteins which serve to
regulate the frequency of transcription initiation at a
particular promoter. The polymerase travels along the DNA,
transcribing the information contained in the coding strand
from its 5' to 3' end into messenger RNA which is in turn
translated into a polypeptide having the amino acid sequence
for which the DNA codes. Each amino acid is encoded by a
unique nucleotide triplet or "codon" within what may for
present purposes be referred to as the "structural gene", i.e.
that part which encodes the amino acid sequence of the
expressed product. After binding to the promoter, the RNA
polymerase first transcribes nucleotides encoding a ribosome
binding site, then a translation initiation or "start" signal
(ordinarily ATG, which in the resulting messenger RNA becomes
AUG), then the nucleotide codons within the structural gene
itself. So-called stop codons are transcribed at the end of
the structural gene whereafter the polymerase may form an
additional sequence of messenger RNA which, because of the
presence of the stop signal, will remain untranslated by the
ribosomes. Ribosomes bind to the binding site provided on the
messenger RNA, in bacteria ordinarily as the mRNA is being
formed, and themselves produce the encoded polypeptide,
beginning at the translation start signal and ending at the
previously mentioned stop signal. The desired product is
produced if the sequences encoding the ribosome binding site
are positioned properly with respect to the AUG initiator codon
and if all remaining codons follow the initiator codon in
phase. The resulting product may be obtained by lysing the
host cell and recovering the product by appropriate
purification from other microorganism protein.
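The requirement that codons follow the initiator codon "in phase" can be illustrated with a short sketch. It assumes the standard genetic code and works on the coding-strand DNA directly (ATG rather than AUG); the example sequences and all names are hypothetical and chosen only for illustration.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_start(dna):
    """Translate from the first ATG, codon by codon, until a stop codon."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop signal: ribosome releases the chain
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical leader, a short structural gene with a stop codon, and a trailer.
IN_PHASE = "AGGAGGTCCCC" + "ATGTTCGTCAATCAGTAA" + "GCT"
# Same gene with one extra base after the start codon: every codon shifts.
SHIFTED = "AGGAGGTCCCC" + "ATG" + "A" + "TTCGTCAATCAGTAA" + "GCT"

if __name__ == "__main__":
    print(translate_from_start(IN_PHASE))  # MFVNQ
    print(translate_from_start(SHIFTED))   # MIRQSVS
```

A single inserted base changes every downstream residue and removes the intended stop signal from the reading frame, which is why the inserted gene must sit in phase with the start signal.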

Polypeptides expressed through the use of recombinant DNA
technology may be entirely heterologous, as in the case of the
direct expression of human growth hormone, or alternatively may
comprise a heterologous polypeptide and, fused thereto, at
least a portion of the amino acid sequence of a homologous
peptide, as in the case of the production of intermediates for
somatostatin and the components of human insulin. In the
latter cases, for example, the fused homologous polypeptide
comprised a portion of the amino acid sequence for beta
galactosidase. In those cases, the intended bioactive product
is bioinactivated by the fused, homologous polypeptide until
the latter is cleaved away in an extracellular environment.
Fusion proteins like those just mentioned can be designed so as
to permit highly specific cleavage of the precursor protein
from the intended product, as by the action of cyanogen bromide
on methionine, or alternatively by enzymatic cleavage. See,
e.g., G.B. Patent Publication No. 2 007 676 A.
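The specificity of cyanogen bromide cleavage can be pictured with a small model. This is a simplified sketch: it treats cleavage as a split after every methionine and ignores the chemical conversion of that methionine at the new C-terminus. The carrier sequence is hypothetical; the product used is the 30-unit B chain of human insulin in one-letter code.

```python
def cnbr_cleave(peptide):
    """Split a one-letter peptide string after every methionine (M)."""
    fragments = []
    current = ""
    for residue in peptide:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


CARRIER = "GSKLAE"  # hypothetical carrier residues, for illustration only
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # human insulin B chain
FUSION = CARRIER + "M" + B_CHAIN

if __name__ == "__main__":
    for fragment in cnbr_cleave(FUSION):
        print(fragment)
```

Because the desired product here contains no internal methionine, it is released intact as the last fragment, while any methionine inside the carrier only breaks up the portion that is discarded.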

Human insulin has hitherto been obtained employing
techniques of recombinant DNA technology. The process used a
synthetic gene for the A chain which is expressed in E. coli
and a separate synthetic gene for the B chain which is
expressed in another E. coli. D.V. Goeddel et al., Proc. Nat.
Acad. Sci., USA, 76, 106 (1979). The two chains are obtained
as chimeric polypeptides (proteins) comprising the desired
sequence of amino acids (either the A or B chain sequence)
bound to another section of carrier polypeptide designed to
protect the desired sequence from proteases in the E. coli.
The chimeric proteins have a selective cleavage site adjacent
the desired polypeptide sequence of the A or B chain which
permits separation of the desired sequence from the carrier
polypeptide. Isolation of the two sequences is followed by the
formation in vitro of the disulfide bonds.

This process is necessarily complicated by the fact that 
two distinct genetically modified bacterial strains must be 
obtained and maintained. Further, the prior process requires 
separate isolation of the A and B chain and the crosslinking of 
the two chains by means of the formation of the disulfide bonds 
without the aid in orientation provided by an intact C chain. 

The present application describes a process for the
construction of a single gene to express, in a single
microorganism, a chimeric protein which includes a complete
human proinsulin polypeptide or an analog thereof differing
from human proinsulin in only the amino acid sequence of the C
chain. Human insulin can be cleanly excised from these
polypeptides after in vitro formation of the disulfide
crosslinks between the A and B chains.

By this process, proinsulin, and analogs thereof, can be
directly obtained in substantially pure form and free of
biologically active impurities. Similarly, the proinsulins can
be effectively processed to obtain insulin chemically identical
to human insulin and similarly free of biologically active
impurities, thus promising a more effective treatment of human
diabetes than is possible using animal derived insulin.

Summary of the Invention 

The present invention provides a method for obtaining human
insulin by means of a chimeric polypeptide comprising the polypeptide
sequence of human proinsulin, or an analog thereof differing from the
polypeptide sequence of human proinsulin in the sequence of amino
acids comprising the C chain, fused to additional protein or
protein fragment, there being a selective cleavage site which
permits cleavage of the proinsulin or its analog from the additional
protein or fragment. The cleaved proinsulin product may then be
caused to orient by formation of the characteristic insulin A
and B chain disulfide crosslinks and the crosslinked insulin
precursor may then be excised from the "C" chain carrier.

The "C" chain (or, hereinafter, bridging chain) of amino
acid units of the analog proinsulins made according to the
present invention may comprise as few as 2 amino acid units.
The identity and sequence of amino acid units intermediate the
ends of the bridging chains in analogs of human proinsulin are
not particularly significant. However, the end units thereof
must be units which permit facile excision of the bridging
chain from the A and B chains of human insulin. Preferably,
excision occurs after the proinsulin molecule has been cleaved
from the additional protein of the chimeric polypeptide and, most
preferably, after the thus cleaved proinsulin molecule has been
folded and the disulfide links between the A and B chains
characteristic of human insulin have been formed.

Preferably, the bridging chain has sites which permit its
excision by enzymatic means. Preferred for this purpose
are Arg-Arg and Lys-Arg units on the bridging chain which are
adjacent the terminal -COOH end of the B chain and terminal
-NH2 end of the A chain, respectively, as is found in human
proinsulin itself. Also preferred are two Arg-Arg units
between the B and A chains.

The chimeric proteins of the present invention are obtained
by expression of a heterologous structural gene for the
proinsulin or analog in a recombinant microbial cloning vehicle
in which the gene is in reading phase with a DNA sequence
coding for an additional protein portion of the chimeric protein
and the cleavage site. In preferred embodiments of the
invention, the cleavage site is methionine at the N-terminal of
the proinsulin which permits cleavage using cyanogen bromide.

The additional protein can vary but the preferred
additional protein is methionine amino acid or the presequence
of preproinsulin or a portion thereof or β-galactosidase
or a substantial portion thereof or a portion of the amino acid
sequence encoded by a fragment of the trp leader polypeptide
gene fused to a portion of the trp E polypeptide gene or the
trp D polypeptide. The added, ultimately superfluous portion
of the chimeric protein is selected to provide protection from
bacterial proteases which might otherwise digest the proinsulin.

The preferred recombinant microbial cloning vehicle is a
modified, preferably bacterial, plasmid containing the
structural gene in a reading phase with the DNA sequence which
codes for the added, superfluous portion of the chimeric
polypeptide and the selective cleavage site.

The manner in which these and other objects and advantages
of the invention are achieved will be apparent to those skilled
in the art after consideration of the following description of
the preferred embodiments and the illustrations of Figs. 1-6.

Brief Description of the Drawings 

Figure 1 shows the nucleotide sequence of a gene for human 
proinsulin. 

Figure 2 illustrates a scheme for obtaining a plasmid 
containing a fragment of the gene of Figure 1 which was derived 
using reverse transcription from mRNA. 

Figure 3 illustrates a scheme for obtaining a plasmid
containing the gene of Figure 1 suitable for transformation
into, e.g., E. coli for expression of human proinsulin.

Figure 4 illustrates the analysis by HPLC chromatography of
human proinsulin obtained by expression from an, e.g., E. coli
transformant containing the plasmid derived from the scheme of
Figure 3.

Figure 5 illustrates segments of a gene for expression of 
an analog of human proinsulin differing from human proinsulin 
in the amino acid sequence of the C, bridging chain. 

Figure 6 illustrates a scheme for assembling a plasmid
containing a gene for transformation into, e.g., E. coli for
expression of an analog of human proinsulin.

Detailed Description 

A. Preparation of Human Proinsulin 

1. Preparation of Synthetic Gene Coding For The 32 
N-Terminal Amino Acids of Proinsulin 
a. Oligonucleotide Synthesis 

A series of 18 oligonucleotides, short nucleotide
chains 10-12 units in length shown in Table 1, were prepared as
a first step to the construction of a gene coding for the first
32 amino acids of proinsulin, the amino acid sequence of which
is shown in Fig. 1 as a part of the nucleotide sequence of the
entire gene ultimately constructed for use in the present
invention for the expression of human proinsulin. The
individual nucleotides in the gene are identified by the
letters A, T, C or G representing the bases adenine, thymine,
cytosine or guanine which distinguish one nucleotide from
another.

TABLE 1

Oligonucleotides:

H1      AATTCATGTT
H2      CGTCAATCAGCA
H3      CCTTTGTGGTTC
H4      TCACCTCGTTGA
H5      TTGACGAACATG
H6      CAAAGGTGCTGA
H7      AGGTGAGAACCA
H8      AGCTTCAACG
B1      AGCTTTGTAC
B2      CTTGTTTGCGGT
B3      GAACGTGGTTTC
B4      TTCTACACTCCT
B5'     AAGACTCGCC
B6      AACAAGGTACAA
B7      ACGTTCACCGCA
B8      GTAGAAGAAACC
B9      AGTCTTAGGAGT
B10'    GATCCGGCG
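As a rough consistency check on Table 1, the oligonucleotides can be joined and translated in a short sketch. It rests on two assumptions that the table itself does not state: that H1-H4 and B1-B4 plus B5' are the coding-strand pieces in 5' to 3' order, and that the remaining oligonucleotides belong to the complementary strand. It also joins the two halves directly, whereas in the description they are cloned separately and joined at a restriction site.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed coding-strand oligonucleotides from Table 1.
TOP_H = ["AATTCATGTT", "CGTCAATCAGCA", "CCTTTGTGGTTC", "TCACCTCGTTGA"]  # H1-H4
TOP_B = ["AGCTTTGTAC", "CTTGTTTGCGGT", "GAACGTGGTTTC",
         "TTCTACACTCCT", "AAGACTCGCC"]                                 # B1-B4, B5'
H5 = "TTGACGAACATG"  # assumed complementary-strand oligonucleotide


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


LEFT_HALF = "".join(TOP_H)
RIGHT_HALF = "".join(TOP_B)
# Skip the 5-base AATTC end so that reading starts at the ATG codon.
CODING = (LEFT_HALF + RIGHT_HALF)[5:]

if __name__ == "__main__":
    print(len(LEFT_HALF))                       # 46
    print(reverse_complement(H5) in LEFT_HALF)  # True
    print(translate(CODING))
```

Under these assumptions the left half is 46 bases long, H5 bridges the H1/H2 junction, and the translation is methionine followed by the 30 units of the B chain and a terminal Arg, 32 units in all.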
The synthetic nucleotides are shown between
brackets in Fig. 1. These oligonucleotides were
synthesized by the triester method: R. Crea et al., Proc. Nat.
Acad. Sci., USA, 75, 5765 (1978), K. Itakura et al., J. Biol.
Chem., 250, 4592 (1975) and K. Itakura et al., J. Am. Chem.
Soc., 97, 7327 (1975). Some of these are oligonucleotides
which were used in a gene coding for the B chain of human
insulin previously described by Crea et al., Proc. Nat. Acad.
Sci., USA, 75, 5765 (1978), and Goeddel et al., Proc. Nat. Acad.
Sci., USA, 76, 106 (1979). Two of the synthetic
oligonucleotides (B5' and B10') were synthesized for this
project; the others were prepared according to Crea et al.
(supra). The two new oligonucleotides, also prepared according
to Crea et al., incorporate restriction enzyme recognition
sites for HpaII and terminal BamHI, the latter used for
cloning. The other end of the gene contains a sticky end of an
EcoRI site for cloning purposes.

b. Joining of Synthetic Oligonucleotides 

The eight oligonucleotides H1-H8 were used
previously to construct the left half of the B chain gene. This
was used in this process and is described by Goeddel et al.,
Proc. Natl. Acad. Sci. USA 76, 106 (1979). It contains the
codons for amino acids 1-13 of the B chain and a
methionine unit at the N-terminal, used later to cleave the
proinsulin from bacterially expressed chimeric protein using
cyanogen bromide (CNBr).

The right half of the B chain gene was obtained
from the oligonucleotides B1, B2, B3, B4, B5', B6,
B7, B8, B9 and B10', by ligation using T4 ligase and
a technique described by Goeddel et al. (supra). The gene
fragment produced codes for the 14-30 amino acid units of the B
chain and the first unit, Arg, of the bridging chain.
Incorporated into the gene sequence is an HpaII restriction
enzyme site in the same reading frame and location as an HpaII
site in the human insulin gene. After purification of the
ligated gene fragment by polyacrylamide gel electrophoresis,
and elution of the largest DNA band, the fragment was inserted
into the plasmid pBR322 that had been cleaved with restriction
endonucleases HindIII and BamHI, thereby utilizing the HindIII
and BamHI sites on the synthetic gene fragment. The DNA was
inserted into E. coli 294 (ATCC No. 31446) by transformation.
One plasmid, pB3', recovered from an ampicillin resistant,
tetracycline sensitive clone was found to possess the desired
nucleotide sequence according to the method of A.M. Maxam
et al., Proc. Natl. Acad. Sci. USA 74, 560 (1977).

From the two plasmids pBH1 and pB3', two DNA
fragments were recovered, a 46 base pair EcoRI to HindIII
fragment from pBH1, and a 58 base pair HindIII to BamHI
fragment from pB3'.

The two fragments were ligated together to produce
a fragment having an EcoRI site and a BamHI site. This
fragment was inserted in plasmid pBR322 which had been treated
with EcoRI and BamHI restriction endonucleases using the method
described in Goeddel et al., Proc. Nat. Acad. Sci., USA, 76, 106
(1979) and cloned in E. coli K-12 strain 294 (ATCC No. 31446)
to provide the plasmid pIB3. After cloning, the plasmid pIB3
was cleaved with EcoRI and HpaII restriction endonucleases to
recover the synthetic gene fragment (Fragment 1, Figure 3)
containing the codons for the N-terminal proinsulin amino acids
preceded by a methionine codon as shown in Figs. 1 and 3. The
synthetic gene was isolated by polyacrylamide gel
electrophoresis.

2. Isolation of a cDNA Gene Coding For the 55 C-Terminal
Amino Acids of Human Proinsulin

The scheme for obtaining the cDNA gene is shown
schematically in Fig. 2.

A decanucleotide was synthesized containing the
recognition sequence for BamHI endonuclease, to which was added
a 3' polythymidylic acid tract of approximately 20 residues.
Its sequence is pCCGGATCCGGTT18T. This oligonucleotide was
used to prime AMV reverse transcriptase for cDNA synthesis.
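The restriction sites carried by the primer can be located with a short scan. This sketch assumes the usual recognition sequences for the four enzymes named in this description and reads "TT18T" as a tract of 20 thymidines; the function and variable names are hypothetical.

```python
# Commonly cited recognition sequences for the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "HpaII": "CCGG",
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(seq):
    """Map each enzyme name to the 0-based positions where its site starts."""
    return {
        name: [i for i in range(len(seq)) if seq.startswith(site, i)]
        for name, site in SITES.items()
    }


DECAMER = "CCGGATCCGG"
PRIMER = DECAMER + "T" * 20  # decanucleotide plus poly(dT) tract

if __name__ == "__main__":
    print(find_sites(PRIMER))
    print(reverse_complement(DECAMER) == DECAMER)  # True: self-complementary
```

The decanucleotide carries one BamHI site flanked by two HpaII sites and equals its own reverse complement, so the site is recreated in double-stranded form once the second cDNA strand is synthesized.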

The primer was prepared using terminal deoxynucleotidyl
transferase (Enzo Biochem, 200 units) with one µmole of the
BamHI decanucleotide in a reaction volume of 0.6 ml containing
1.5 x 10^-4 TTP. The reaction was conducted at 37°C for one
hour in a buffer system described by A. Chang et al., Nature,
275, 617 (1978).

Human insulinoma polyA tissue (2.5 µg) provided by the
Institut fuer Diabetesforschung, Muenchen, West Germany
(Dr. Wolfgang Kemmler) containing mRNA isolated by the process
of Ullrich et al., Science, 196, 1313 (1977) was converted to
double stranded cDNA by a procedure according to Wickens et al.,
J. Biol. Chem., 253, 2483 (1978). Thus, 80 µl containing 15 mM
Tris/HCl (pH 8.3 at 42°C), 21 mM KCl, 8 mM MgCl2, 30 mM
β-mercaptoethanol, 2 mM of the primer dCCGGATCCGGTT18T, and 1
mM dNTPs was preincubated at 0°C. Then 40 units of AMV reverse
transcriptase were added and the mixture incubated for 15
minutes at 42°C.

The complementary cDNA strand was synthesized in a
volume of 150 µl containing 25 mM Tris/HCl (pH 8.3), 35 mM KCl,
4 mM MgCl2, 15 mM β-mercaptoethanol and 9 units of DNA
polymerase I (Klenow fragment). The mixture was incubated at
15°C for 90 minutes followed by 15 hours at 4°C. S1 nuclease
digestion was then performed for 2 hours at 37°C using 1000
units of S1 nuclease (Miles Laboratories) as described by
Wickens et al., supra. The double stranded cDNA (0.37 µg) was
subjected to electrophoresis on an 8 percent polyacrylamide
gel. DNA fragments larger than 500 base pairs were eluted.
Oligodeoxycytidylic acid residues were added to the 3' ends of
the fragments using terminal deoxynucleotidyl transferase by
the procedure of J.V. Maizel Jr., Meth. Virol., 180 (1971).
The dC tailed cDNA fragments were annealed to pBR322 that had
been cleaved with the restriction endonuclease PstI and tailed
with deoxyguanidylic acid using terminal deoxynucleotidyl
transferase. The resulting plasmids were transformed into
E. coli K-12 strain 294 and cloned. Colonies resistant to
tetracycline but sensitive to ampicillin were isolated and
screened for plasmids having three sites cleavable by the
restriction endonuclease PstI, indicative of the presence of the
gene for insulin. Sures et al., Science, 208, 57 (1980).

One plasmid, pHI104, containing a 600 base pair insert
and giving the anticipated PstI restriction pattern was
determined to contain a site cleavable by BamHI between the 3'
polyA and the polyGC introduced during the cDNA preparation.
Some of the nucleotide sequence of the insert is shown in
Fig. 1. This sequence differs slightly from that previously
reported by I. Sures et al., Science, 208, 57 (1980) and G.
Bell, et al., Nature, 282, 525 (1979), having an AT base pair
where underlined rather than a CG pair, because the mRNA used
was from tissue isolated from a different individual. The
resistance to antibiotics conferred on a bacterium by this
plasmid is indicated by the marker Ap^s for ampicillin
sensitivity and Tc^r for tetracycline resistance.

3. Assembly Of A Gene Coding For Human Proinsulin 

The scheme used for assembling a gene coding for human
proinsulin is shown in Fig. 3.

The synthetic gene segment coding for the first 31
amino acids of proinsulin, fragment 1 in Fig. 3, was recovered
from 50 µg of the plasmid pIB3 using the restriction
endonucleases EcoRI and HpaII as described above. This fragment
also contains the codon ATG for methionine in place of the
"presequence" of preproinsulin. Introduction of a methionine
unit at this point permits the polypeptide ultimately expressed
to be cleaved at this point by cyanogen bromide (CNBr) to
separate the proinsulin from the residue of the polypeptide
which served to protect the proinsulin portion from bacterial
proteases.

The cDNA gene segment coding for amino acids 32-86, as
well as the translation stop codons and the 3' untranslated
region of the mRNA, was recovered from 40 µg of the plasmid
pHI104 by treatment first with BamHI and then HpaII as shown in
Fig. 3 as fragment 2.

The two fragments were isolated by polyacrylamide
electrophoresis followed by electroelution. The gene fragments
were joined by treatment with T4 DNA ligase in 20 µl ligase
buffer (Goeddel et al., Proc. Nat. Acad. Sci., USA, 76, 106
(1979)) at 4°C for 24 hrs. The mixture was diluted with 50 µl
H2O, extracted with phenol, then chloroform and then
precipitated with ethanol.

The resulting DNA was treated with BamHI and EcoRI to
regenerate these sites and remove gene polymers. The assembled
proinsulin gene was isolated by polyacrylamide gel
electrophoresis and ligated using T4 ligase to the plasmid pBR322
which had previously been treated with EcoRI and BamHI. The
resulting DNA was transformed into E. coli K-12 strain 294 and
cloned. Colonies were screened using the plasmid conferred
antibiotic resistance markers. The desired clones were
tetracycline-sensitive (Tc^s) and ampicillin-resistant (Ap^r).
Plasmid pHI3 was isolated from one such colony and the
proinsulin gene was characterized by nucleotide sequence analysis
and found to have the sequence shown in Fig. 1.
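The antibiotic screens used for the pBR322 recombinants described so far follow one pattern, sketched below. The sketch assumes the usual pBR322 layout, in which an insert at the PstI site interrupts ampicillin resistance and an insert at or between the HindIII and BamHI sites abolishes tetracycline resistance; the function name and data layout are hypothetical.

```python
def expected_phenotype(cloning_sites):
    """Return (ampicillin, tetracycline) phenotype, 'r' or 's', for a
    pBR322 recombinant carrying an insert at the given restriction sites."""
    sites = set(cloning_sites)
    ampicillin = "s" if "PstI" in sites else "r"
    tetracycline = "s" if sites & {"HindIII", "BamHI"} else "r"
    return ampicillin, tetracycline


if __name__ == "__main__":
    # cDNA inserted at PstI (plasmid pHI104): Ap^s, Tc^r
    print(expected_phenotype(["PstI"]))
    # Synthetic fragment between HindIII and BamHI (plasmid pB3'): Ap^r, Tc^s
    print(expected_phenotype(["HindIII", "BamHI"]))
    # Assembled gene between EcoRI and BamHI (plasmid pHI3): Ap^r, Tc^s
    print(expected_phenotype(["EcoRI", "BamHI"]))
```

Each result matches the phenotype reported in the text for the corresponding clone, which is what lets colonies carrying an insert be told apart from those carrying the unaltered vector.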

4. Construction of a Plasmid Designed to Express a 
Chimeric Protein Containing the Human Proinsulin Peptide 
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Plasmid pBRHl, [R.I. Rodriguez, et al^., Nucleic Adds 
Research 6, 3257-3287 (1979) expresses anpicillin resistance and 
contains the gene for tetracycline resistance but, there being 
no associated promoter, does not express that resistance. The 
plasmid is accordingly tetracycline sensitive. By Introducing 
a promoter-operator system in the EcoRI site, the plasmid can 
be made tetracycline resistant. 

Plasmid pGM1 carries the E. coli tryptophan operon
containing the deletion LE1413 (G.F. Miozzari et al., (1978)
J. Bacteriology 1457-1466) and hence expresses a fusion
protein comprising the first 6 amino acids of the trp leader
and approximately the last third of the trp E polypeptide
(hereinafter referred to in conjunction as LE'), as well as the
trp D polypeptide in its entirety, all under the control of the
trp promoter-operator system. The plasmid, 20 μg, was digested
with the restriction enzyme PvuII, which cleaves the plasmid at
five sites. The gene fragments 2 were next combined with EcoRI
linkers (consisting of a self-complementary oligonucleotide 3
of the sequence: pCATGAATTCATG) providing an EcoRI cleavage
site for a later cloning into a plasmid containing an EcoRI
site. The 20 μg of DNA fragments 2 obtained from pGM1 were
treated with 10 units of T4 DNA ligase in the presence of 200
picomoles of the 5'-phosphorylated synthetic oligonucleotide
pCATGAATTCATG in 20 μl T4 DNA ligase buffer (20 mM tris,
pH 7.6, 0.5 mM ATP, 10 mM MgCl2, 5 mM dithiothreitol) at 4°C
overnight. The solution was then heated 10 minutes at 70°C to
halt ligation. The linkers were cleaved by EcoRI digestion and
the fragments, now with EcoRI ends, were separated using 5
percent polyacrylamide gel electrophoresis (hereinafter
"PAGE") and the three largest fragments isolated from the gel
by first staining with ethidium bromide, locating the fragments
with ultraviolet light, and cutting from the gel the portions
of interest. Each gel fragment, with 300 microliters 0.1xTBE,
was placed in a dialysis bag and subjected to electrophoresis
at 100 V for one hour in 0.1xTBE buffer (TBE buffer contains:
10.8 gm tris base, 5.5 gm boric acid, 0.09 gm Na2EDTA in 1
liter H2O). The aqueous solution was collected from the
dialysis bag, phenol extracted, chloroform extracted and made
0.2 M NaCl, and the DNA recovered in H2O after EtOH
precipitation.
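
The statement that the EcoRI linker is self-complementary can be checked directly. The following Python sketch is an illustration only, not part of the disclosed procedure; the helper name reverse_complement is hypothetical, and the 5'-phosphate "p" of the linker is omitted from the string.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


LINKER = "CATGAATTCATG"  # linker sequence from the text, 5'-phosphate omitted
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence

# A self-complementary oligonucleotide reads the same as its reverse complement,
# so two copies can anneal to give a blunt duplex carrying the EcoRI site.
is_self_complementary = reverse_complement(LINKER) == LINKER
site_offset = LINKER.find(ECORI_SITE)  # 0-based position of the site in the linker

print(is_self_complementary, site_offset)
```

Because the linker anneals to itself, a single synthetic oligonucleotide suffices to supply both strands of the duplex that is ligated to the blunt PvuII fragments.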

pBRH1 was digested with EcoRI and the enzyme removed by
phenol extraction followed by chloroform extraction; the DNA was
recovered in water after ethanol precipitation. The resulting
DNA molecule was, in separate reaction mixtures, combined with
each of the three DNA fragments obtained above and ligated with
T4 DNA ligase as previously described. The DNA present in
the reaction mixture was used to transform competent E. coli
K-12 strain 294 (K. Backman et al., Proc Nat'l Acad Sci USA 73,
4174-4198 [1976]) (ATCC no. 31446) by standard techniques (V.
Hershfield et al., Proc Nat'l Acad Sci USA 71, 3455-3459
[1974]) and the bacteria plated on LB plates containing
20 μg/ml ampicillin and 5 μg/ml tetracycline. Several
tetracycline-resistant colonies were selected, plasmid DNA
isolated and the presence of the desired fragment confirmed by
restriction enzyme analysis. The resulting plasmid was
designated pBRHtrp.

pBRHtrp was digested with EcoRI restriction enzyme and
the resulting fragment isolated by PAGE and electroelution.
EcoRI-digested plasmid pSOM11 (K. Itakura et al., Science 198,
1056 (1977); G.B. patent publication no. 2 007 676 A) was
combined with this fragment. The mixture was ligated with T4
DNA ligase as previously described and the resulting DNA
transformed into E. coli K-12 strain 294 as previously
described. Transformant bacteria were selected on
ampicillin-containing plates. Resulting ampicillin-resistant
colonies were screened by colony hybridization (M. Gruenstein
et al., Proc Nat'l Acad Sci USA 72, 3951-3965 [1975]) using as
a probe the trp promoter-operator-containing fragment isolated
from pBRHtrp, which had been radioactively labelled with
32P. Several colonies shown positive by colony hybridization
were selected, plasmid DNA was isolated and the orientation of
the inserted fragments determined by restriction analysis
employing restriction enzymes BglII and BamHI in double
digestion. The plasmid carried by the selected E. coli 294
transformant was designated pSOM7Δ2.

Plasmid pBR322 was HindIII digested and the protruding
HindIII ends in turn digested with S1 nuclease. The S1
nuclease digestion involved treatment of 10 μg of
HindIII-cleaved pBR322 in 30 μl S1 buffer (0.3 M NaCl, 1 mM
ZnCl2, 25 mM sodium acetate, pH 4.5) with 300 units S1
nuclease for 30 minutes at 15°C. The reaction was stopped by
the addition of 1 μl of 30 X S1 nuclease stop solution (0.8 M
tris base, 50 mM EDTA). The mixture was phenol extracted,
chloroform extracted and ethanol precipitated, then EcoRI
digested as previously described and the large fragment 1'
obtained by the PAGE procedure followed by electroelution. The
fragment obtained has a first EcoRI sticky end and a second,
blunt end whose coding strand begins with the nucleotide
thymidine.

16 μg of plasmid pSOM7Δ2 was diluted into 200 μl of buffer
containing 20 mM Tris, pH 7.5, 5 mM MgCl2, 0.02 percent NP40
detergent, 100 mM NaCl and treated with 0.5 units EcoRI. After
15 minutes at 37°C, the reaction mixture was phenol extracted,
chloroform extracted and ethanol precipitated and subsequently
digested with BglII. The larger resulting fragment 3' was
isolated by the PAGE procedure followed by electroelution.
This fragment contains the codons "LE'(p)" for the proximal end
of the LE' polypeptide, i.e., those upstream from the BglII
site. The fragment 3' was next ligated to the fragment 4'.
Fragment 4' is prepared by successive digestion of pTrp24
(prepared upon EcoRI digest of pThα1 (Biochemistry 80, 6096
(1980)) followed by Klenow polymerase I reaction to blunt the
EcoRI residues; BglII digestion creates a linear fragment which
was recircularized by reaction with the LE'-containing fragment
having BglII sticky and blunt ends) with BglII and EcoRI,
followed by PAGE and electroelution. The ligation was done in
the presence of T4 DNA ligase to form the plasmid pSOM7Δ2Δ4,
which was transformed into E. coli strain 294 as previously
described.

Plasmid pSOM7Δ2 was BglII digested and the resulting BglII
sticky ends made double stranded with the Klenow
polymerase I procedure using all four deoxynucleotide
triphosphates. EcoRI cleavage of the resulting product
followed by PAGE and electroelution of the small fragment 2'
yielded a linear piece of DNA containing the tryptophan
promoter-operator and codons of the LE' "proximal" sequence
upstream from the BglII site ("LE'(p)"). The product had an
EcoRI end and a blunt end resulting from filling in the BglII
site. However, the BglII site is reconstituted by ligation of
the blunt end of fragment 2' to the blunt end of fragment
1'. Thus, the two fragments were ligated in the presence of
T4 DNA ligase to form the recircularized plasmid pHKY10,
which was propagated by transformation into competent E. coli
strain 294 cells. Tetracycline-resistant cells bearing the
recombinant plasmid pHKY10 were grown up, plasmid DNA
extracted and digested in turn with BglII and PstI, followed by
isolation by the PAGE procedure and electroelution of the large
fragment, a linear piece of DNA having PstI and BglII sticky
ends, to give DNA fragment 5'.

Plasmid pSOM7Δ2Δ4 could be manipulated to provide a
second component for a system capable of receiving a wide
variety of heterologous structural genes. The plasmid was
subjected to partial EcoRI digestion followed by PstI digestion
and the fragment containing the trp promoter/operator was
isolated by the PAGE procedure followed by electroelution.
Partial EcoRI digestion was necessary to obtain a fragment
which was cleaved adjacent to the 5' end of the somatostatin
gene but not cleaved at the EcoRI site present between the
ampicillin resistance gene and the trp promoter-operator.
Ampicillin resistance lost by the PstI cut in the ApR gene
could be restored upon ligation with fragment 5'.

In a first demonstration the third component, a
structural gene for somatostatin (6'), was obtained and purified
by PAGE and electroelution.

The three gene fragments 4', 5' and 6' could now be
ligated together in proper orientation to form the plasmid
pSOM7Δ2Δ4.

The complete human proinsulin gene, including the
N-terminal codon that codes for methionine, was recovered from
the plasmid pHI3 by treatment with EcoRI and BamHI and purified
by gel electrophoresis. This gene, fragment 3 in Fig. 3, was
joined to two other DNA fragments with T4 ligase; these are
identified as fragments 4 and 5 in Fig. 3.

Fragment 4 contains a promoter and a carrier protein
gene derived from the plasmid pSOM7Δ2Δ4 by partial digestion
with EcoRI and complete digestion with PstI. This fragment
contains an E. coli tryptophan (trp) promoter-operator, nine
codons from the trp leader peptide, 190 codons from the trp E
gene and an EcoRI cleavage site introduced in place of the trp
E termination codon. (This gene construction will be referred
to as trp LE' below.) The tryptophan attenuator region,
including the last 5 codons of the trp leader peptide sequence,
and the first two thirds of the trp E gene are deleted in this
construction.

The trp E gene (trp LE'), contained in fragment 4, is
modified to incorporate an EcoRI site in place of the
termination codon of the trp E gene, as shown, to give the
correct reading frame with the inserted gene fragment 3.

This fragment is bounded at the opposite end by a
PstI site derived from pBR322 and incorporates the first
half of the β-lactamase gene. The fragment was recovered from
20 μg of plasmid pSOM7Δ2Δ4 by partial digestion with EcoRI
followed by treatment with PstI. The promoter-containing
fragments were isolated by polyacrylamide gel electrophoresis.

Fragment 5 was obtained from plasmid pHKY10. This
plasmid is a derivative of pBR322 and contains a tryptophan
promoter-operator in place of the tetracycline promoter. The
HindIII site of pBR322 has been converted to a BglII site. The
plasmid, 20 μg, was treated with PstI and BglII and the large
fragment, designated 5 in Fig. 3, purified by polyacrylamide
gel electrophoresis.

The two fragments 4 and 5 were ligated together to
reassemble the gene for β-lactamase via a PstI site and confer
ampicillin resistance (Apr). The ends then present an EcoRI
site and a BglII site for insertion of a gene. These two sites
cannot be ligated together due to nonhybridization of the 3'
protruding ends and can only be joined by incorporating a DNA
fragment that possesses 3' ends complementary to the EcoRI and
BglII ends. The proinsulin gene, fragment 3 containing EcoRI
and BamHI ends, is such a molecule. Thus, the three fragments,
5 μg of 4, 1 μg of 3 and 1 μg of 5, were combined and treated
with T4 DNA ligase at 4°C for 24 hours in ligase buffer. Upon
circularization to close the plasmid, the tryptophan
promoter-operator controls expression of a fusion (chimeric)
protein of which proinsulin is a portion. Tetracycline
resistance (Tcr) is also conferred.

The DNA mixture from the ligation was transformed into
E. coli K-12 strain 294 by the procedure of Goeddel et al.,
Nature, 281, 541 (1979). Colonies were selected that would
grow on both ampicillin and tetracycline. Of 3 colonies
tested, 2 were found by SDS polyacrylamide gel electrophoresis
(J.F. Maizel, Jr., Meth. Virol., 5, 180 (1971)) to express a
protein of the molecular weight expected of the trp LE'-
proinsulin fusion. One plasmid, pHI7, was completely
characterized as to DNA sequence of the incorporated gene and
restriction analysis of the vector pBR322.
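
The three-fragment ligation above relies on the BamHI end of the proinsulin gene being joinable to a BglII end, while EcoRI and BglII ends cannot join each other. The Python sketch below illustrates that end-compatibility rule. It is not taken from the specification: the recognition sequences and cut positions are the commonly published values for these enzymes and are stated here as assumptions, and the function names overhang and compatible are hypothetical.

```python
# Recognition site and top-strand cut offset for each enzyme.
# These are commonly published values, assumed here for illustration.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def overhang(name: str) -> str:
    """Single-stranded protruding sequence left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]


def compatible(a: str, b: str) -> bool:
    """Two ends can anneal when their (self-complementary) overhangs are identical."""
    return overhang(a) == overhang(b)


for pair in [("BamHI", "BglII"), ("EcoRI", "BglII"), ("EcoRI", "EcoRI")]:
    print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

Under these assumptions the vector can only close when a fragment with one EcoRI end and one BamHI (or BglII) end is inserted, which also fixes the orientation of the insert.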
5. Proinsulin Isolation

The plasmid pHI7 was transformed into E. coli K-12
strain RV308 (ATCC No. 31608) and grown in 500 ml of LB medium
(J.H. Miller, Experiments in Molecular Genetics, 433, Cold
Spring Harbor 1972) containing 10 μg/ml of ampicillin to a cell
density of 5 OD. This was diluted into a 10 liter fermentation
vessel (New Brunswick) and grown in M9 media (Miller, supra, at
431) to a cell density of 14 OD. Cells were collected by
centrifugation and frozen.

Cells (164 g) were thawed in 5 volumes sucrose lysis
buffer (10 percent sucrose, 0.1 M tris HCl, pH 7.9, 50 mM EDTA,
0.2 M NaCl) containing 0.1 mM phenylmethylsulfonyl fluoride and
1.0 mM dimercaptopropanol, and lysed by sonication. The lysis
pellet was collected by centrifugation and suspended by
stirring overnight at 4°C with 4 volumes of 7.0 M guanidine-HCl,
1 mM dimercaptopropanol, 1 mM EDTA. After centrifugation the
supernatant was diluted 20 times with cold water and allowed to
stand 2 hours at 4°C. The precipitate (9.6 g dry) was collected
by centrifugation and reacted overnight at room temperature in
220 ml 88 percent formic acid with 5 g CNBr to cleave the
proinsulin from the trp LE' fusion. After rotary evaporation
the residue was suspended in 200 ml 7.5 M urea, 1 mM EDTA, 20 mM
ammonium carbonate, and the pH adjusted to 9.0 with
ethanolamine (5.5 ml). Ten grams of sodium sulfite and five
grams of sodium tetrathionate were added to convert cysteines
and cystines to S-sulfonate groups and the reaction stirred at
room temperature for 5 hours.

The reaction mixture was desalted on a G-25 Medium
column in 7.5 M urea, 10 mM Tris pH 8.5, 10 mM EDTA. The desalted
protein was loaded onto a DEAE-Sephadex (A-25) column and
eluted with a 2-liter linear gradient of 0 to 0.5 M NaCl in the
same tris-urea buffer. The proinsulin-like material,
identified by RIA or HPLC (see below), was concentrated on an
Amicon YM5 membrane and resolved in the tris-urea buffer on a
G-50 Medium column. The G-50 fractions, identified by HPLC,
were pooled (104 ml) and the buffer changed on a column of G-25
Fine equilibrated with 30 mM ammonium carbonate, pH 8.8. The
lyophilized protein weighed 216 mg. The recoveries at each
step are shown in Table 2.
6. Proinsulin Analysis 

The S-sulfonated proinsulin obtained was analyzed by 
amino acid analysis. This analysis was made by Eli Lilly and 
Co. and is shown in Table 3. 

TABLE 3

        Amino Acids   Amino Acids           Amino Acids   Amino Acids
        Calculated    Predicted             Calculated    Predicted

Asp        4.40           4         Ile        1.34           2
Thr        2.90           3         Leu       12.21          12
Ser        4.50           5         Tyr        3.93           4
Glu       15.64          15         Phe        2.61           3
Pro        3.42           3         His        2.02           2
Gly       11.08          11         Lys        1.96           2
Ala        4.46           4         Arg        3.92           4
Cys        2.85           6         Val        5.58           6
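
As a reading aid for Table 3, the short Python sketch below sums the predicted column and lists the difference between calculated and predicted values for each residue. It only re-uses the numbers printed in the table; the variable names are hypothetical and nothing here is part of the original analysis.

```python
# residue: (calculated, predicted), copied from Table 3
TABLE_3 = {
    "Asp": (4.40, 4), "Thr": (2.90, 3), "Ser": (4.50, 5), "Glu": (15.64, 15),
    "Pro": (3.42, 3), "Gly": (11.08, 11), "Ala": (4.46, 4), "Cys": (2.85, 6),
    "Ile": (1.34, 2), "Leu": (12.21, 12), "Tyr": (3.93, 4), "Phe": (2.61, 3),
    "His": (2.02, 2), "Lys": (1.96, 2), "Arg": (3.92, 4), "Val": (5.58, 6),
}

# The predicted residues should add up to the chain length of proinsulin:
# 30 (B chain) + 35 (C chain) + 21 (A chain) residues.
total_predicted = sum(predicted for _, predicted in TABLE_3.values())

# Calculated minus predicted, rounded to the precision of the table.
deviations = {aa: round(calc - pred, 2) for aa, (calc, pred) in TABLE_3.items()}
largest_shortfall = min(deviations, key=deviations.get)

print(total_predicted)
for aa, dev in sorted(deviations.items(), key=lambda item: item[1]):
    print(f"{aa:4s}{dev:+.2f}")
```

The listing makes it easy to see which residues fall furthest below the predicted count; the table itself gives no explanation for individual deviations, so none is assumed here.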






The presence of proinsulin was also confirmed by
radioimmunoassay. To determine the radioimmunoactivity the
Corning 125I-insulin kit was used. The antibody was found to be
about 4 percent cross-reactive with proinsulin and about 0.2
percent cross-reactive with reduced proinsulin. Unknowns were
heated 2 minutes at 90°C in 7.5 M urea, 2 mM β-mercaptoethanol,
pH 8-10 (ethanolamine), and aliquots diluted in
phosphate-buffered saline (0.1 gelatin) and immediately
assayed. These results were determined from comparisons to a
reduced proinsulin standard curve generated in the same way from
either bovine proinsulin or human proinsulin S-sulfonate.

The proinsulin-S-sulfonate was also assayed by HPLC. In
Fig. 4 profiles are shown for S-sulfonated bovine proinsulin,
bacterially derived human proinsulin and a combination of the
two. In this analysis, samples of bovine and human proinsulin
sulfonate and a mixture of the two were applied to a 10 μm
RP-18 column and eluted using a linear gradient of 21 to 33
percent n-propanol and acetonitrile (2:1) in 50 mM NH4OAc
(pH 7). The proteins are seen to run very nearly coincident.
The large absorbance peak at the end of the chromatogram is due
to rapid changes in the solvent composition and not eluted
protein.

B. Preparation of Proinsulin Analog 

Described below is the synthesis of a gene which codes for
the expression of an analog of proinsulin comprising the A and
B chains of human insulin connected by a bridging chain which
differs from the C chain of human proinsulin in that it
contains only 6 amino acid units rather than the 35 unit
polypeptide of human proinsulin shown in Fig. 1. Specifically
the 6 units are, reading in order from the last unit of the B
chain to the A chain, Arg-Arg-Gly-Ser-Lys-Arg. This sequence
has the same end sequences Arg-Arg and Lys-Arg as does human
proinsulin, thus permitting excision of the bridging chain by
proteolytic means.
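
The layout of the analog just described can be sketched in a few lines of Python. The bridging sequence is taken from the text; the one-letter A and B chain sequences are the standard published sequences of human insulin and are an assumption here, since this section does not print them (Fig. 1 is referred to instead). The function name excise_bridge is hypothetical, and the function models only the bookkeeping of excision, not the enzymology.

```python
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # human insulin B chain, 30 residues (assumed)
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"           # human insulin A chain, 21 residues (assumed)
BRIDGE = "RRGSKR"                           # Arg-Arg-Gly-Ser-Lys-Arg, from the text

analog = B_CHAIN + BRIDGE + A_CHAIN         # B chain, bridging chain, A chain


def excise_bridge(chain: str, b_len: int = 30, a_len: int = 21):
    """Split a B-bridge-A polypeptide and check the dibasic ends of the bridge."""
    b = chain[:b_len]
    bridge = chain[b_len:len(chain) - a_len]
    a = chain[len(chain) - a_len:]
    if not (bridge.startswith("RR") and bridge.endswith("KR")):
        raise ValueError("bridge lacks the Arg-Arg / Lys-Arg end sequences")
    return b, bridge, a


b, bridge, a = excise_bridge(analog)
print(len(analog), bridge, len(b), len(a))
```

With a 6-unit bridge the analog is 57 residues long, which matches the "57 amino acids" used in the heading and in Table 5 of this section.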

A chain of 6 amino acids is an acceptable length for a
modified (or analog) bridging "C" chain which will permit
folding and the subsequent formation of the disulfide
crosslinks between A and B chains characteristic of the hormone
insulin. However, those skilled in the art will appreciate
that bridging chains shorter or longer than 6 would also be
useful, provided they permit folding and the formation of the
necessary disulfide bonds. Sequences of 100 or even more amino
acid units can be employed in the bridging chain. However, the
practical difficulty of obtaining gene fragments coding for
very long sequences makes bridging chain analogs of fewer
than 35 amino acid units more attractive from a practical point
of view.

The ends of the bridging chain, no matter how many
intermediate amino acid units it has or in what order, must be
constructed to permit excision of the bridging chain.
Although alternative means may be employed, we prefer to use
the sequences Arg-Arg and Lys-Arg as found in proinsulin itself,
as proteolytic cleavage using trypsin and carboxypeptidase B
occurs cleanly at these sites.

1. Preparation of Synthetic Gene Coding for the 57 Amino
Acids of an Analog Proinsulin

a. Oligonucleotide Synthesis 

The chemical synthesis methods as well as the
synthesis of the DNA gene fragments coding for the A and B
chains of human insulin have been described: K. Itakura
et al., J. Biol. Chem., 250, 4592 (1975), K. Itakura et al.,
J. Am. Chem. Soc., 97, 7327 (1975), Crea et al., Proc. Nat.
Acad. Sci., USA, 75, 5765 (1978) and Goeddel et al., Proc.
Nat. Acad. Sci., USA, 76, 106 (1979). Five new oligonucleotide
fragments were synthesized by similar methods, also using the
triester process described in the above cited references.
These sequences are shown below in Table 4 and Fig. 5.

TABLE 4 

Oligonucleotides

AAGACTCGTCGTG 
GATCCAAGCGTGGCATC 
GATCCACGACGAGTCTT 
CAACGATGCCACGCTTG 
TCGACTATTAGTT 
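
The five sequences of Table 4 can be checked against the bridging chain they are meant to encode. The Python sketch below is an illustration under stated assumptions: the labels C1 to C5 are assigned in the order the sequences are printed, which the table itself does not state, and the small codon dictionary lists only the standard genetic-code entries needed here. The helper names are hypothetical.

```python
OLIGOS = {  # labels assumed to follow the printed order of Table 4
    "C1": "AAGACTCGTCGTG",
    "C2": "GATCCAAGCGTGGCATC",
    "C3": "GATCCACGACGAGTCTT",
    "C4": "CAACGATGCCACGCTTG",
    "C5": "TCGACTATTAGTT",
}

# Subset of the standard genetic code covering the codons that occur below.
CODONS = {"AAG": "Lys", "ACT": "Thr", "CGT": "Arg", "GGA": "Gly",
          "TCC": "Ser", "GGC": "Gly", "ATC": "Ile"}


def reverse_complement(seq: str) -> str:
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def translate(seq: str) -> list:
    """Translate complete codons from the first base; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return [CODONS[seq[i:i + 3]] for i in range(0, usable, 3)]


# Coding strand across the B-chain / bridge / A-chain junction.
junction = OLIGOS["C1"] + OLIGOS["C2"]
peptide = translate(junction)

# Lower-strand oligonucleotides should contain the complement of the upper ones.
c3_pairs_c1 = reverse_complement(OLIGOS["C1"]) in OLIGOS["C3"]
c4_pairs_c2 = reverse_complement(OLIGOS["C2"])[:13] in OLIGOS["C4"]
has_bamhi_site = "GGATCC" in junction

print("-".join(peptide))
print(c3_pairs_c1, c4_pairs_c2, has_bamhi_site)
```

Under this labelling the junction reads Lys-Thr-Arg-Arg-Gly-Ser-Lys-Arg-Gly-Ile, i.e. the last two B chain residues, the six-unit bridge and the first two A chain residues, and the Gly-Ser codons coincide with a BamHI recognition sequence, consistent with the BamHI sites used in the cloning steps that follow.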

b. Joining of Synthetic Oligonucleotides 

Figure 5 shows the synthetic oligonucleotides of
the insulin A and B chain genes previously prepared and the
manner in which the new fragments C1-C5 were used in the
enzymatic construction of a complete gene for a proinsulin
analog. The scheme for obtaining this gene is set forth in
Fig. 6.

A plasmid pBH1 containing the left half of the B
chain gene was used in this process and is described by Goeddel
et al., Proc. Nat. Acad. Sci., USA, 76, 106 (1979). It
contains the codons for amino acids 1-13 of the B chain and
a methionine unit at the N-terminal which will be used later to
cleave the proinsulin analog from the bacterially expressed
chimeric protein using cyanogen bromide (CNBr).

The gene for the right half of the B chain was
obtained from the oligonucleotides B1, B2, B3, B4, C1,
B6 and B7, B8, B9 and C3 (C1 and C3 replacing the
B5 and B10 sequences in the previously prepared gene
fragment) by ligation using T4 ligase in conventional techniques.
This gene fragment codes for amino acid units 14-30 of the
B chain and the first two units, Arg-Arg, of the bridging
(modified "C") chain (pBC1).

After purification by polyacrylamide gel
electrophoresis and elution of the largest DNA band, the fragment
was inserted into the plasmid pBR322 that had been cleaved with
restriction endonucleases HindIII and BamHI, thereby utilizing
the HindIII and BamHI sites on the gene fragment, and cloned in
E. coli 294 (ATCC No. 31446). The plasmid pBC1 recovered from
an ampicillin-resistant, tetracycline-sensitive clone possessed
the desired nucleotide sequence according to the method of A.M.
Maxam et al., Proc. Nat. Acad. Sci., USA, 74, 560 (1977).

The A gene was constructed similarly from
oligonucleotides C2, A2, A3, A4, A5, A6, C4,
A8, A9, A10, A11 and C5 (C2, C4 and C5
replacing the A1, A7 and A12 sequences in the previously
prepared gene fragment) using T4 ligase in conventional
techniques. This fragment codes for the 21 amino acids of the
A chain and the four units of the bridging "C" chain,
Gly-Ser-Lys-Arg. After purification, also by polyacrylamide
electrophoresis, the fragment was inserted into the plasmid
pBR322 which had been cleaved with restriction endonucleases
EcoRI and SalI, using the EcoRI and SalI sites on the fragment,
and cloned in E. coli 294. An ampicillin-resistant,
tetracycline-sensitive clone yielded plasmid pCA18 having the
desired nucleotide sequence by the method of A.M. Maxam et al.,
supra.

2. Construction of the Proinsulin Analog Gene and 
Corresponding Expression Plasmid 

The desired expression plasmid was prepared from
plasmids pBH1, pBC1 and pCA18 as shown in Fig. 6. The plasmid
pBH1 was cleaved with HindIII and ligated to the fragment BC
excised from plasmid pBC1 by treatment with HindIII and BamHI.
The resulting plasmid pBC135 was cleaved with EcoRI and ligated
to an EcoRI fragment of plac5 which contains the lac control
region and the majority of the β-galactosidase structural gene
(designated Z). K. Itakura et al., Science, 198, 1056 (1977).
This ligation produced plasmid pIB254, which was cleaved with
BamHI and SalI and treated with alkaline phosphatase. The
product of this cleavage was ligated to fragment CA excised
from plasmid pCA18 by treatment with BamHI and SalI as shown in
Fig. 6, transformed into E. coli 294 and cloned. The plasmid
pBCA5 was recovered from ampicillin-resistant,
tetracycline-sensitive clones in E. coli 294 grown on X-gal
plates containing ampicillin and contained the DNA coding
sequence for the proinsulin analog as indicated by the method
of A.M. Maxam et al., supra.

3. Expression of Proinsulin Analog 

The fully characterized plasmid pBCA5 was inserted into
E. coli RV308 and grown in four-liter flasks containing 1.5
liter LB containing 20 mg/l ampicillin. Recovered cells (322 g
wet) were lysed by sonication in two liters of 10 percent
sucrose, 50 mM EDTA, 0.1 M tris/HCl, pH 7.9, 0.1 mM
phenylmethylsulfonyl fluoride, 0.2 M NaCl, and 1 mM
1,3-dithio-2-propanol. After centrifugation (30 min, 5000 rpm)
the pellet was suspended by stirring in 400 ml 7 M guanidine
hydrochloride, 0.1 mM 1,3-dithio-2-propanol, 1 mM EDTA. This
suspension was centrifuged (30 min, 12,000 rpm) and the
supernatant diluted six-fold into cold water. The
precipitated protein (12.4 g dry weight) was collected by
centrifugation (20 min, 5000 rpm) and treated overnight at RT
with 2.8 g (26.4 mmol) CNBr in 200 ml 88 percent formic acid to
cleave the proinsulin analog (hereinafter "analog-'C'
proinsulin") from the β-galactosidase residues at the
methionine unit. After rotary evaporation to dryness at
under 30°C, water was added and the volume again reduced. The
residue was suspended in 300 ml [illegible] guanidine
hydrochloride and the pH adjusted to 9.0 with ethanolamine.
To this were added 15.0 g sodium sulfite and 7.5 g sodium
tetrathionate to convert cysteines and cystines to cysteine
S-sulfonate groups and the mixture allowed to react at room
temperature for six hours. The reaction mixture was
exhaustively dialyzed against 1 mM EDTA at 4°C and the
precipitated [illegible] S-sulfonates collected by
centrifugation (20 min, 5000 rpm). The pellet was suspended in
50 [illegible] Tris/HCl, pH 7.5, filtered and loaded onto a
DE-52 column (2.5 x 87 cm) in the same buffer at 4°C. The
column was eluted with a linear gradient of 0 to 0.5 M NaCl in
the same buffer. The presence [remainder of passage
illegible].

The analog-"C" proinsulin S-sulfonate was purified by
[illegible] injections via a 500 μl loop.
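
The cyanogen bromide charge quoted in the cleavage step above (2.8 g, stated as 26.4 mmol) can be cross-checked with a short calculation. This Python sketch is only an arithmetic illustration; the atomic masses are standard rounded values supplied here as an assumption, and the ratio to dry protein weight is a derived figure that the text does not itself state.

```python
# Standard atomic masses in g/mol (rounded values, assumed for this check).
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "Br": 79.904}

cnbr_molar_mass = sum(ATOMIC_MASS.values())  # CNBr has one atom of each element

cnbr_grams = 2.8      # from the text
protein_grams = 12.4  # dry weight of precipitated protein, from the text

cnbr_mmol = cnbr_grams / cnbr_molar_mass * 1000.0
cnbr_per_gram_protein = cnbr_grams / protein_grams

print(f"molar mass of CNBr: {cnbr_molar_mass:.3f} g/mol")
print(f"2.8 g CNBr = {cnbr_mmol:.1f} mmol")
print(f"CNBr per g dry protein: {cnbr_per_gram_protein:.2f} g")
```

The computed amount agrees with the 26.4 mmol given in the text to within rounding.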
TABLE 5

Amino Acid Analysis of 24-hour 6N HCl hydrolysis (110°) of
purified analog-"C" proinsulin. Cysteine was quantitated by
separate determination of cysteic acid on performic acid
oxidized sample and calculated by cysteic/alanine ratio.
Values increased to compensate for acid decomposition were
serine (10 percent) and threonine (5 percent). N.D. = not
determined.

Amino Acid    aa mole percent x 57    amino acids predicted

Ala                  1.19                      1
Arg                  4.05                      4
Asp                  3.17                      3
Cys/2                4.87                      6
Glu                  6.95                      7
Gly                  5.37                      5
His                  2.03                      2
Ile                  1.39                      2
Leu                  5.98                      6
Lys                  2.02                      2
Met                  0                         0
Phe                  3.03                      3
Pro                  1.96                      1
Ser                  4.21                      4
Thr                  3.09                      3
Trp                  N.D.                      0
Tyr                  3.99                      4
Val                  3.74                      4
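
The "predicted" column of Table 5 can be reproduced from the sequence of the 57-residue analog. The Python sketch below is a cross-check under assumptions: the A and B chain sequences are the standard published human insulin sequences (not printed in this section), the bridge is the Arg-Arg-Gly-Ser-Lys-Arg sequence given earlier, and Asn and Gln are counted as Asp and Glu because acid hydrolysis converts the amides to the acids. Variable names are hypothetical.

```python
from collections import Counter

B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # assumed standard sequence
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"           # assumed standard sequence
BRIDGE = "RRGSKR"                           # from the text
analog = B_CHAIN + BRIDGE + A_CHAIN

# One-letter code to the three-letter label used in Table 5.
# Asn -> Asp and Gln -> Glu reflect deamidation during acid hydrolysis.
LABEL = {"A": "Ala", "R": "Arg", "N": "Asp", "D": "Asp", "C": "Cys",
         "Q": "Glu", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
         "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
         "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val"}

# "amino acids predicted" column of Table 5 (Cys/2 entered as Cys).
PREDICTED = {"Ala": 1, "Arg": 4, "Asp": 3, "Cys": 6, "Glu": 7, "Gly": 5,
             "His": 2, "Ile": 2, "Leu": 6, "Lys": 2, "Met": 0, "Phe": 3,
             "Pro": 1, "Ser": 4, "Thr": 3, "Trp": 0, "Tyr": 4, "Val": 4}

counts = Counter(LABEL[residue] for residue in analog)
computed = {aa: counts.get(aa, 0) for aa in PREDICTED}

mismatches = {aa: (computed[aa], PREDICTED[aa])
              for aa in PREDICTED if computed[aa] != PREDICTED[aa]}
print(sum(computed.values()), mismatches)
```

Under these assumptions the computed composition matches the predicted column residue for residue and sums to 57, which supports reading the table as referring to the B chain, six-unit bridge and A chain without the N-terminal methionine.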
4. Folding of Analog-"C" Proinsulin

Analog-"C" proinsulin folding to obtain a crosslinked
[illegible] was accomplished by reaction of 4 mg of
[illegible], which was dissolved in a degassed buffer of 40 mM
glycine, pH 10.6, 3 M urea, 0.3 M NaCl at 0° [illegible]. To
this was added β-mercaptoethanol to a concentration of 6.4
[unit illegible] and the reaction sealed under [illegible].
[Illegible] increase in RIA activity and was complete within
about four [illegible].

The reaction was stopped by the addition of acetic acid
[illegible]. The reaction mixture was purified by HPLC by
prep-collection from a C-18 Ultrasphere column. The resolving
gradient was 21 - 28 percent acetonitrile in 0.2 M ammonium
sulfate, 50 mM NaOAc, pH 4.0, at 0.2 percent/min and
1.0 ml/min.
5. Assay for Analog-"C" Proinsulin

[Illegible] insulin radioimmunoassay as the S-sulfonate.
However, the [illegible] of analog-"C" proinsulin as an
expression product was confirmed by crossreactivity of the
thiol form [illegible] mercaptoethanol, diluted into RIA buffer
(0.1 M sodium phosphate, pH 7.4, 0.15 M NaCl, [illegible]
percent gelatin) and immediately assayed. Reproducibility
depends upon strict timing, since extended incubation of the
diluted reduced test solution leads to variable oxidative
folding of the molecule into forms with higher RIA activity.

The thiol form gave an activity of 0.9 percent compared
[illegible]. The [illegible] form of the B chain of porcine
proinsulin had activity of 0.1 percent. On [illegible] with a
slight excess of β-mercaptoethanol to obtain the folded,
crosslinked product, reaction mixtures show RIA activity of
20-40 percent that of insulin.

C. Preparation of Human Insulin

1. Folding and Linking of A and B Chains. 

The human proinsulin or analogs thereof, prepared in
accordance with the present invention, for example, following
the procedures of Parts A or B above, as their respective
S-sulfonates are induced to fold with proper formation of
internal disulfide bonds (between cysteine 7 B and cysteine
7 A, between cysteine 19 B and cysteine 20 A, and between
cysteines 6 and 11 A) by means of controlled sulfhydryl
interchange catalyzed by β-mercaptoethanol.

To a 0.1 mg/ml solution of proinsulin S-sulfonate in
degassed 50 mM sodium glycinate, pH 10.6, at 4°C was added
β-mercaptoethanol to a final concentration of 0.3 mM. After
four hours, the reaction is essentially complete as measured by
the increase in cross-reacting activity of the mixture in
insulin RIA. The yield of proinsulin is about 80 percent.
Proinsulin is then purified from side products by gel
permeation, ion exchange, and/or reverse phase high pressure
liquid chromatography to yield product in substantially
purified form.

2. Excision of Bridging Chain 

The human proinsulin or analogs thereof, prepared in
accordance with the present invention, for example, following
the procedure of Part C. 1. above, are proteolytically converted
to insulin, for example in accordance with the procedure of
Kemmler et al., J. Biol. Chem., 246, 6786 (1971). The obtained
insulin is then purified by column chromatography or zinc
crystallization to yield product in substantially purified
form, identical to natural human insulin and freed of
biologically active contaminants.

While the invention in its most preferred embodiment is
described with reference to E. coli, other microorganisms could
likewise serve as host cells, for example, yeasts such as
Saccharomyces cerevisiae, Bacilli such as Bacillus subtilis and
preferably other enterobacteriaceae, among which may be
mentioned as examples Salmonella typhimurium and Serratia
marcescens, utilizing plasmids that can replicate and express
heterologous gene sequences in these organisms. The
invention is not to be limited to the preferred embodiments
described.

CLAIMS:

1. A chimeric polypeptide comprising:

a) the polypeptide sequence of a proinsulin comprising
the A and B chains of human insulin connected by a bridging
chain of at least 2 amino acid units, said bridging chain
having sites at each end which permit its excision from
between said A and B chains; and

b) an additional protein or protein fragment;

there being a cleavage site at or adjacent said additional
protein or fragment and adjacent one end of the polypeptide
sequence of said proinsulin.

2. A chimeric polypeptide according to claim 1 wherein the
amino acid sequence of the bridging chain corresponds to that
of the C peptide of human proinsulin.

3. A chimeric polypeptide according to claim 1 wherein the
amino acid sequence of the bridging chain is
Arg-Arg-Gly-Ser-Lys-Arg.

4. A chimeric polypeptide according to claim 1 wherein the
amino acid sequence of the bridging chain is Arg-Arg.

5. A chimeric polypeptide according to claim 1 or claim 2
wherein the sites permitting excision of the bridging chain
are the amino acid units Arg-Arg at the B chain end and
Lys-Arg at the A chain end.

6. A chimeric polypeptide according to any one of claims 1
to 5 wherein said cleavage site is a methionine unit.

7. A chimeric polypeptide according to claim 6 wherein the
methionine unit is adjacent the N-terminal of said proinsulin.

8. A chimeric polypeptide according to any one of claims 1
to 7 wherein the additional protein or fragment is either

a) at least a substantial portion of β-galactosidase; or

b) a fragment of the trp leader polypeptide fused to a
portion of the trp E polypeptide.

9. A process of producing a chimeric polypeptide according
to claim 1 comprising the steps:

1) inserting a gene coding for said proinsulin into a
microbial cloning vehicle in which the gene is in reading
phase with a DNA sequence coding for said additional protein
or fragment comprising said cleavage site;

2) transforming said cloning vehicle containing said
inserted gene into a microbial host for expression of said
chimeric polypeptide;

3) expressing the chimeric polypeptide; and

4) isolating the expressed chimeric polypeptide.

10. A process for producing a proinsulin comprising the
process of claim 9 and the additional step of cleaving said
chimeric polypeptide to release said proinsulin.

11. A process for producing a protein comprising the process
of claim 10 and the additional step of excising said bridging
chain from said proinsulin.

12. A process according to claim 11 wherein the excision of
the bridging chain is preceded by the formation of disulfide
bonds between the A and B chains and the product of excision
is human insulin.

13. Human proinsulin when prepared by the process of claim
10.

14. Human insulin when prepared by the process of claim 12.

15. A cloning vehicle suited for transformation of a
microbial host and use therein for expressing a chimeric
polypeptide according to claim 1.

16. The plasmid pHI7.

17. The plasmid pBCA5.

18. A viable culture of microbial transformants containing
a cloning vehicle according to any one of claims 15 to 17.

19. A method of producing human insulin comprising: 1)
cultivating a culture according to claim 18; 2) separating the
resulting cellular mass; 3) isolating the precursors to human
insulin comprising the chimeric polypeptide according to claim
1; 4) cleaving the additional protein therefrom; 5) effecting
folding and linkage of the A and B chains; and 6) excising the
bridging chain.

20. A product of microbial expression comprising the
polypeptide sequence of a proinsulin comprising the A and B
chains of human insulin connected by a bridging chain of at
least 2 amino acid units and from which human insulin is
derivable upon excision of said bridging chain.
FIG. 2. [Flow diagram, not reproduced. Steps shown: mRNA is
copied with reverse transcriptase + dNTP's from a BamHI linker
primer, 5' CCGGATCCGG(T)n 3'; the RNA/DNA hybrid is denatured
and treated with Klenow Pol I; S1 nuclease; terminal
transferase + dCTP. The vector, with EcoRI, BamHI and PstI
sites marked, is cut with PstI and treated with terminal
transferase + dGTP.]